Semi-automatic grammar recovery

Authors

  • Ralf Lämmel
  • Chris Verhoef
Abstract

We propose an approach to the construction of grammars for existing languages. The main characteristic of the approach is that the grammars are not constructed from scratch but they are rather recovered by extracting them from language references, compilers, and other artifacts. We provide a structured process to recover grammars including the adaptation of raw extracted grammars and the derivation of parsers. The process is applicable to possibly all existing languages for which business critical applications exist. We illustrate the approach with a non-trivial case study. Using our process and some basic tools, we constructed in a few weeks a complete and correct VS COBOL II grammar specification for IBM mainframes. In addition, we constructed a parser for VS COBOL II, and were the first to publish a (web-enabled) grammar specification so that others can use this result to construct their own grammar-based tools for VS COBOL II or derivatives.
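One adaptation step a raw extracted grammar typically needs is a connectivity check: which nonterminals are used but never defined, and which are defined but never used. The following is a minimal sketch of such a check, not the authors' tooling. The rule notation (`name ::= ...`, quoted terminals), the function names `parse_rules` and `connectivity_report`, and the toy rules are all assumptions made for illustration.

```python
def parse_rules(text):
    """Parse lines of the assumed form `name ::= sym sym | sym` into a dict."""
    rules = {}
    for line in text.strip().splitlines():
        if "::=" not in line:
            continue  # skip blank lines and prose left over from extraction
        head, body = line.split("::=", 1)
        # Simplification: a quoted "|" terminal would be split incorrectly here.
        alternatives = [alt.split() for alt in body.split("|")]
        rules.setdefault(head.strip(), []).extend(alternatives)
    return rules


def connectivity_report(rules, start):
    """Return (used but never defined, defined but never used) nonterminals."""
    defined = set(rules)
    used = {
        sym
        for alts in rules.values()
        for alt in alts
        for sym in alt
        if not sym.startswith('"')  # quoted symbols are treated as terminals
    }
    return used - defined, defined - used - {start}


# Hypothetical fragment, loosely COBOL-flavoured; not taken from any manual.
RAW = '''
program ::= statement-list
statement-list ::= statement | statement statement-list
statement ::= "MOVE" identifier "TO" identifier | "DISPLAY" literal
orphan-clause ::= "WITH" identifier
'''

undefined, unused = connectivity_report(parse_rules(RAW), start="program")
print(sorted(undefined))  # ['identifier', 'literal']
print(sorted(unused))     # ['orphan-clause']
```

In a recovery setting, each reported name would prompt a manual decision: add a missing definition, connect an orphaned rule, or correct a misspelled nonterminal in the source document.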

Similar resources

The Amsterdam Toolkit for Language Archaeology

GRK — the Grammar Recovery Kit — illustrates options for automation and corresponding tool support in the context of developing quality language references that readily cater for the derivation of parsers. GRK provides the proof-of-concept for two notions: (i) semi-automatic grammar recovery; (ii) language-reference re-engineering. GRK’s support for semi-automatic grammar recovery means that GR...

MARS: A Metamodel Recovery System Using Grammar Inference

Domain-specific modeling (DSM) assists subject matter experts in describing the essential characteristics of a problem in their domain. When a metamodel is lost, repositories of domain models can become orphaned from their defining metamodel. Within the purview of model-driven engineering, the ability to recover the design knowledge in a repository of legacy models is needed. In thi...

Semi-automatic acquisition of domain-specific semantic structures

This paper describes a methodology for semi-automatic grammar induction from unannotated corpora belonging to a restricted domain. The grammar contains both semantic and syntactic structures, which are conducive towards language understanding. Our work aims to ameliorate the reliance of grammar development on expert handcrafting or the availability of annotated corpora. To strive for a reasonab...

Learning Strategies In A Grammar Induction Framework

This work extends a semi-automatic grammar induction approach previously proposed in [1]. We investigate the use of Information Gain (IG) in place of Mutual Information (MI) for grammar induction based on an unannotated training corpus. Experiments using the ATIS-3 training corpus indicate that the use of IG led to better precision and recall of desired semantic categories and at earlier stages...

MARS: A metamodel recovery system using grammar inference

Domain-specific modeling (DSM) assists subject matter experts in describing the essential characteristics of a problem in their domain. Various software artifacts can be generated from the defined models, including source code, design documentation, and simulation scripts. The typical approach to DSM involves the construction of a metamodel, from which instances are defined that represent speci...

Journal:
  • Softw., Pract. Exper.

Volume 31, issue not listed

Pages not listed

Publication date: 2001